Video Streaming at Scale 🎥

Core Concept

Key Insight: Streaming platforms don't fetch entire videos first—they use chunked delivery and adaptive streaming for immediate playback.

1. Streaming vs Downloading

Method	Approach	User Experience
Downloading	Fetch entire file first	Wait minutes for GB files
Streaming	Fetch small chunks progressively	Playback starts in seconds

Why streaming wins: Videos can be 1-10GB+. Nobody waits 5 minutes to start watching.

2. Video Processing Pipeline

Step 1: Encoding & Segmentation

# Original video processing
original_video.mp4
├── 144p/  (low bitrate)
├── 360p/  (medium bitrate)
├── 720p/  (high bitrate)
├── 1080p/ (HD bitrate)
└── 4K/    (ultra-high bitrate)

# Each resolution split into chunks
720p/
├── chunk_001.ts (2-10 seconds)
├── chunk_002.ts (2-10 seconds)
├── chunk_003.ts (2-10 seconds)
└── ...

Step 2: Manifest Creation

HLS Example (.m3u8 playlist):

#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:10
#EXTINF:10.0,
chunk_001.ts
#EXTINF:10.0,
chunk_002.ts
#EXTINF:10.0,
chunk_003.ts

3. Adaptive Bitrate Streaming (ABR)

How It Works

Monitor network speed continuously
Switch quality based on bandwidth
Seamless transitions between resolutions

ABR Decision Logic

if (bandwidth > 5Mbps) {
  quality = "1080p"
} else if (bandwidth > 2Mbps) {
  quality = "720p"
} else if (bandwidth > 1Mbps) {
  quality = "480p"
} else {
  quality = "240p"
}

Real-World Example

User starts video:
chunk1_360p.ts → plays immediately
chunk2_720p.ts → bandwidth good, upgrade
chunk3_720p.ts → continues HD
chunk4_480p.ts → network congestion, downgrade

4. Streaming Protocols

HLS (HTTP Live Streaming)

Created by: Apple
Format: .m3u8 playlists + .ts segments
Support: iOS, Safari, most platforms
Latency: 6-30 seconds

DASH (Dynamic Adaptive Streaming)

Created by: ISO standard
Format: .mpd manifests + various segments
Support: Wide browser support
Latency: 2-30 seconds

WebRTC (Real-time)

Use case: Live streaming, video calls
Latency: Sub-second
Trade-off: Higher complexity, less scalable

5. CDN Architecture

Origin Server (1x)
├── Video processing
├── Master storage
└── Initial upload

Edge Servers (100s-1000s)
├── Geographic distribution
├── Cached video chunks
└── Low-latency delivery

User Request Flow:
User → Nearest Edge Server → Chunk Delivery

CDN Benefits

Reduced latency: Serve from nearby servers
Load distribution: Spread traffic across nodes
Redundancy: Multiple copies prevent outages
Bandwidth efficiency: Cache popular content locally

6. Client-Side Streaming

Buffer Management

// Typical streaming player logic
const BUFFER_SIZE = 30; // seconds ahead
const MIN_BUFFER = 5; // minimum before stalling

function manageBuffer() {
  if (currentBuffer < MIN_BUFFER) {
    fetchNextChunks(3); // emergency fetch
  } else if (currentBuffer < BUFFER_SIZE) {
    fetchNextChunks(1); // normal operation
  }
  // else: buffer full, pause fetching
}

Quality Switching

Upward switching: Gradual (avoid wasted bandwidth)
Downward switching: Immediate (prevent stalling)
Chunk boundaries: Quality changes only between segments

7. Advanced Optimizations

Byte-Range Requests

GET /video.mp4 HTTP/1.1
Range: bytes=2048000-4095999

Fetch specific byte ranges from single file
Useful for seeking to specific timestamps
Alternative to chunk-based streaming

Preloading Strategies

Predictive buffering: Fetch likely next chunks
Quality pre-loading: Cache multiple resolutions
Seek optimization: Pre-load keyframes for scrubbing

Modern Codecs

Codec	Compression	Quality	CPU Load
H.264	Standard	Good	Low
H.265/HEVC	50% better	Excellent	Medium
AV1	30% better than H.265	Best	High

8. Platform Examples

YouTube

Protocol: DASH primarily
Resolutions: 144p → 8K
Strategy: Start low (360p), upgrade quickly
Innovation: VP9/AV1 codecs for efficiency

Netflix

Protocol: Custom DASH implementation
Pre-encoding: 15+ quality levels per title
CDN: Custom Open Connect network
Optimization: Per-title encoding optimization

Twitch (Live)

Protocol: HLS for playback, RTMP for ingest
Latency: 3-5 second delay
Transcoding: Real-time quality variants
Challenge: Live = can't pre-process chunks

9. Performance Metrics

User Experience Metrics

Startup time: Time to first frame (<2s target)
Rebuffering ratio: % of playback time stalled (<1%)
Quality switches: Frequency of resolution changes
Bandwidth efficiency: Data used vs video length

Technical Metrics

Chunk fetch time: Network request latency
Decode performance: Client-side processing speed
Cache hit ratio: CDN efficiency (>90% target)
Origin load: Traffic reaching source servers

10. Implementation Considerations

Client Implementation

// Modern HTML5 video with HLS.js
import Hls from 'hls.js';

const video = document.querySelector('video');
const hls = new Hls({
  maxBufferLength: 30, // 30s buffer
  maxMaxBufferLength: 60, // absolute max
  startLevel: -1, // auto quality
  capLevelToPlayerSize: true, // match screen resolution
});

hls.loadSource('playlist.m3u8');
hls.attachMedia(video);

Server-Side Considerations

Concurrent connections: Handle thousands of simultaneous streams
Geographic distribution: CDN placement strategy
Storage costs: Balance quality vs storage requirements
Processing power: Real-time transcoding for live content

Key Takeaways

✅ Never download entire videos — chunk-based delivery is essential ✅ Adaptive quality — match network conditions dynamically ✅ CDN distribution — serve from edge servers for low latency ✅ Multiple protocols — HLS/DASH for different use cases ✅ Buffer management — balance startup time vs smooth playback ✅ Modern codecs — H.265/AV1 for better compression ✅ Monitoring — track performance metrics for optimization

Bottom line: Streaming at scale requires sophisticated infrastructure, but the user just sees "instant" video playback.

Core Concept​

1. Streaming vs Downloading​

2. Video Processing Pipeline​

Step 1: Encoding & Segmentation​

Step 2: Manifest Creation​

3. Adaptive Bitrate Streaming (ABR)​

How It Works​

ABR Decision Logic​

Real-World Example​

4. Streaming Protocols​

HLS (HTTP Live Streaming)​

DASH (Dynamic Adaptive Streaming)​

WebRTC (Real-time)​

5. CDN Architecture​

CDN Benefits​

6. Client-Side Streaming​

Buffer Management​

Quality Switching​

7. Advanced Optimizations​

Byte-Range Requests​

Preloading Strategies​

Modern Codecs​

8. Platform Examples​

YouTube​

Netflix​

Twitch (Live)​

9. Performance Metrics​

User Experience Metrics​

Technical Metrics​

10. Implementation Considerations​

Client Implementation​

Server-Side Considerations​

Key Takeaways​